Database Schema Matching using Corpus-based Semantic Similarity and Word Segmentation
Abstract
In this paper, we present a new method for database schema matching, the problem of identifying the elements of two given schemas that correspond to each other. The method relies on two techniques based on a large text corpus: one for determining the semantic similarity of two target words and one for automatic word segmentation. Building on these, we present a name-based, element-level database schema matching method. We also use normalized and modified versions of the Longest Common Subsequence string matching algorithm, with weight factors that allow for a balanced combination. Our goal is to develop a schema matching method that uses a single property (the element name) and achieves an F-measure comparable to that of methods that use multiple properties (element name, text description, data instance, context description). We validate our method with experimental studies, the results of which suggest that it is a useful addition to the set of existing schema matchers.
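As a rough illustration of the string-matching component mentioned in the abstract, the following sketch computes a normalized Longest Common Subsequence score between two element names and blends it with a semantic score through weight factors. The abstract does not give the exact formulas, so the normalization used here (squared LCS length divided by the product of the name lengths), the function names, and the default weights are all assumptions made for illustration, not the authors' definitions.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def normalized_lcs(a: str, b: str) -> float:
    """Assumed normalization: LCS length squared over the product of lengths.

    Returns a value in [0, 1]; identical names score 1.0.
    """
    a, b = a.lower(), b.lower()
    if not a or not b:
        return 0.0
    return lcs_length(a, b) ** 2 / (len(a) * len(b))


def name_similarity(a: str, b: str, semantic_score: float,
                    w_string: float = 0.5, w_semantic: float = 0.5) -> float:
    """Hypothetical weighted blend of a string score and a semantic score.

    semantic_score stands in for a corpus-based similarity in [0, 1];
    how it is computed is outside the scope of this sketch.
    """
    return w_string * normalized_lcs(a, b) + w_semantic * semantic_score
```

For example, `normalized_lcs("CustName", "CustomerName")` compares an abbreviated element name with its expanded form; because the shorter name is a subsequence of the longer one, the score depends only on the ratio of their lengths.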
Similar resources
An Improved Semantic Schema Matching Approach
Schema matching is a critical step in many applications, such as data warehouse loading, Online Analytical Processing (OLAP), data mining, the semantic web [2], and schema integration. The task is to find the semantic correspondences between elements of two schemas. Recently, schema matching has attracted considerable interest in both research and practice. In this paper, we present a new impr...
Searching XML Databases for Semantically-related Schemas
In this paper, we address the problem of searching schema databases for semantically-related schemas. We first give a method of finding semantic similarity between pair-wise schemas based on tokenization, part-of-speech tagging, word expansion, and ontology matching. We then address the problem of indexing the schema database through a semantic hash table. Matching schemas in the database are f...
A procedure for Web Service Selection Using WS-Policy Semantic Matching
In general, policy-based approaches play an important role in the management of web services, for instance in the choice of semantic web services and, in particular, quality of service (QoS). The present work illustrates a procedure for selecting among functionally similar web services based on WS-Policy semantic matching. In this study, the procedure of WS-Policy publi...
Developing a Semantic Similarity Judgment Test for Persian Action Verbs and Non-action Nouns in Patients With Brain Injury and Determining its Content Validity
Objective: Evidence from brain trauma suggests that the two grammatical categories of noun and verb are processed in different regions of the brain due to differences in the complexity of grammatical and semantic information processing. Studies have shown that verbs belonging to different semantic categories lead to neural activity in different areas of the brain, and action verb processing is r...
Chinese Entity Relation Extraction Based on Word Co-occurrence
Chinese entity relation extraction is a part of entity relation extraction. Based on entity relation extraction technology and the features of a Chinese news corpus, this paper proposes a novel method for Chinese entity relation extraction. The method, named WCORE (word co-occurrence relation extraction), first measures the semantic similarity by word co-occurrence and then adopts pattern m...
Publication date: 2007